Abstract: In the first of this pair of papers, it was proven that there cannot be a physical computer
to which one can properly pose any and all computational tasks concerning the physical universe. 
It was then further proven that no physical computer C can correctly carry out all computational 
tasks that can be posed to C. As a particular example, this result means that there cannot be a
physical computer that can, for any physical system external to that computer, take the specification
of that external system's state as input and then correctly predict its future state before that future
state actually occurs; one cannot build a physical computer that can be assured of correctly
"processing information faster than the universe does". These results do not rely on systems that are infinite, and/or
non-classical, and/or obey chaotic dynamics. They also hold even if one uses an infinitely fast, 
infinitely dense computer, with computational powers greater than that of a Turing Machine. This 
generality is a direct consequence of the fact that a novel definition of computation, "physical
computation", is needed to address the issues considered in these papers, which concern real
physical computers. While this novel definition does not fit into the traditional Chomsky hierarchy,
the mathematical structure and impossibility results associated with it have parallels in the
mathematics of the Chomsky hierarchy. This second paper of the pair presents a preliminary 
exploration of some of this mathematical structure. Analogues of Chomskian results concerning 
universal Turing Machines and the Halting theorem are derived, as are results concerning the 
(im)possibility of certain kinds of error-correcting codes. In addition, an analogue of algorithmic 
information complexity, "prediction complexity", is elaborated. A task-independent bound is 
derived on how much the prediction complexity of a computational task can differ for two different
reference universal physical computers used to solve that task, a bound similar to the "encoding"
bound governing how much the algorithmic information complexity of a Turing machine
calculation can differ for two reference universal Turing machines. Finally, it is proven that either
the Hamiltonian of our universe proscribes a certain type of computation, or prediction complexity
is unique (unlike algorithmic information complexity), in that there is one and only one version of
it that can be applicable throughout our universe.






INTRODUCTION 

Recently there has been heightened interest in the relationship between physics and computa- 
tion ([1-33]). This interest extends far beyond the topic of quantum computation. On the one 
hand, physics has been used to investigate the limits on computation imposed by operating com- 
puters in the real physical universe. Conversely, there has been speculation concerning the limits 
imposed on the physical universe (or at least imposed on our models of the physical universe) by 
the need for the universe to process information, as computers do. 

To investigate this second issue one would like to know what fundamental distinctions, if any, 
there are between the physical universe and a physical computer. To address this issue the first of 
this pair of papers begins by establishing that the universe cannot contain a computer to which one 
can pose any arbitrary computational task. Accordingly, paper I goes on to consider computer- 
indexed subsets of computational tasks, where all the members of any such subset can be posed to 
the associated computer. It then proves that one cannot build a computer that can "process infor- 
mation faster than the universe". More precisely, it is shown that one cannot build a computer that 
can, for any physical system, correctly predict any aspect of that system's future state before that 
future state actually occurs. This is true even if the prediction problem is restricted to be from the 
set of computational tasks that can be posed to the computer. 

This asymmetry in computational speeds constitutes a fundamental distinction between the 
universe and the set of all physical computers. Its existence casts an interesting light on the ideas 
of Fredkin, Landauer and others concerning whether the universe "is" a computer, whether there 
are "information-processing restrictions" on the laws of physics, etc. [10, 18]. In a certain sense, 
the universe is more powerful than any information-processing system constructed within it could 
be. This result can alternatively be viewed as a restriction on the universe as a whole — the uni- 
verse cannot support the existence within it of a computer that can process information as fast as it 
can. 

The analysis of paper I also establishes (for example) the necessarily fallible nature of retrodiction,
of control, and of observation. (This latter result can be viewed as a kind of uncertainty
principle that does not rely on quantum mechanics.) The way that results of such generality are 
derived is by examining the underlying issues from the broad perspective of the computational 
character of physical systems in general, rather than that of some single precisely specified physi- 
cal system. The associated mathematics does not directly involve dynamical systems like Turing 
machines. Rather it casts computation in terms of partitions of the space of possible worldlines of 
the universe. For example, to specify what input a particular physical computer has at a particular 
time is to specify a particular subset of all possible worldlines of the universe; different inputs to 
the computation correspond to different such subsets. Similar partitions specify outputs of a phys- 
ical computer. Results concerning the (im)possibility of certain kinds of physical computation are 
derived by considering the relationship between these kinds of partitions. In its being defined in 
terms of such partitions, "physical computation" involves a structure that need not even be instan- 
tiated in some particular physically localized apparatus; the formal definition of a physical com- 
puter is general enough to also include more subtle non-localized dynamical processes unfolding 
across the entire universe. 

This second paper begins with a cursory review of these partition-based definitions and 
results of paper I. Although, as elaborated below, it is distinct from the mathematics of the
Chomsky hierarchy, the mathematics and impossibility results governing these partitions bear many
parallels with those of the Chomsky hierarchy. Section 2 of this second paper explicates some of
that mathematical structure, involving topics ranging from error correction to the (lack of) transi- 
tivity of computational predictability across multiple distinct computers. In particular, results are 
presented concerning physical computation analogues of the mathematics of Turing machines, 
e.g., "universal" physical computers, and Halting theorems for physical computers. In addition, an 
analogue of algorithmic information complexity, "prediction complexity", is elaborated. A task- 
independent bound is derived on how much the prediction complexity of a computational task can 
differ for two different reference universal physical computers used to solve that task. This bound
is similar to the "encoding" bound governing how much the algorithmic information complexity
of a Turing machine calculation can differ for two reference universal Turing machines. It is then
proven that one of two cases must hold. One is that the Hamiltonian of our universe proscribes a 
certain type of computation. The other possibility is that, unlike conventional algorithmic
information complexity, its physical computation analogue is unique, in that there is one and only one
version of it that can be applicable throughout our universe.

Throughout these papers, $B = \{0, 1\}$, $\Re$ is defined to be the set of all real numbers, '$\wedge$' is the
logical and operator, and 'NOT' is the logical not operator applied to $B$. To avoid proliferation of
symbols, often set-delineating curly brackets will be used surrounding a single symbol, in which
case that symbol is to be taken to be a variable with the indicated set being the set of all values of that
variable. So for example "$\{y\}$" refers to the set of all values of the variable $y$. In addition $o(A)$ is
the cardinality of any set $A$, and $2^A$ is the power set of $A$. $u \in U$ are the possible states of the
universe, and $\hat{U}$ is the space of allowed trajectories through $U$. So $\hat{u} \in \hat{U}$ is a single-valued map from
$t \in \Re$ to $u \in U$, with $u_t = \hat{u}_t$ the state of the universe at time $t$. Note that since the universe is
microscopically deterministic, $u_t$ for any $t$ uniquely specifies $\hat{u}$. Sometimes there will be implicit
constraints on $\hat{U}$. For example, we will assume in discussing any particular computer that the
space $\hat{U}$ is restricted to worldlines $\hat{u}$ that contain that computer. An earlier analysis addressing
some of the issues considered in this pair of papers can be found in [30].



1. REVIEW OF DEFINITIONS AND FOUNDATIONAL RESULTS RELATED TO
PHYSICAL COMPUTATION

In paper I the process by which real physical computers make predictions concerning physical 
systems is abstracted to produce a mathematical definition of physical computation. This section 
reviews that definition and the associated fundamental mathematical results. The reader is 
referred to paper I for more extensive discussion of the definitions.

i) Definition of a Physical Computer

We start by distinguishing the specification of what we want the computer to calculate from 
the results of that calculation: 

Definition 1: Any question $q \in Q$ is a pair, consisting of a set $A$ of answers and a single-valued
function from $\hat{u} \in \hat{U}$ to $a \in A$. $A(q)$ indicates the $A$-component of the pair $q$.
Here we restrict attention to $Q$ that are non-empty and such that there exist at least two elements
in $A(q)$ for at least one $q \in Q$. We make no other a priori assumptions concerning the spaces
$\{A(q \in Q)\}$ and $Q$. In particular, we make no assumptions concerning their finiteness.
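As a rough illustration of Definition 1, the following sketch represents a question as a pair of an answer set and a single-valued function on a tiny, finite stand-in for the space of worldlines. The four-element universe and the names `U_HAT`, `Question`, `cells`, and `parity` are assumptions made for illustration only; nothing in the definition requires finiteness.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable

# Hypothetical toy universe: four worldlines labelled 0..3 (an assumption, not from the paper).
U_HAT = (0, 1, 2, 3)


@dataclass(frozen=True)
class Question:
    """A question q: a set of answers A(q) plus a single-valued map from worldlines to answers."""
    answers: FrozenSet[Hashable]
    func: Callable[[int], Hashable]

    def __call__(self, u_hat: int) -> Hashable:
        a = self.func(u_hat)
        if a not in self.answers:
            raise ValueError("answer lies outside A(q)")
        return a


def cells(q: Question, worldlines=U_HAT) -> dict:
    """Group worldlines by the answer q assigns them, i.e. the partition induced by q."""
    out: dict = {}
    for u_hat in worldlines:
        out.setdefault(q(u_hat), set()).add(u_hat)
    return out


# A binary-valued question: "is the worldline label odd?"
parity = Question(frozenset({0, 1}), lambda u_hat: u_hat % 2)
print(cells(parity))
```

Grouping worldlines by answer makes explicit the sense in which a question is a partition of the space of worldlines, with each answer labelling one element of that partition.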

Example 1 (conventional prediction of the future): Say that our universe contains a system S
external to our computer that is closed in the time interval [0, T], and let $u$ be the values of the
elements of a set of canonical variables describing the universe. $a$ is the $t = T$ values of the
components of $u$ that concern S, measured on some finite grid $G$ of finite precision. $q$ is this definition of
$a$ with $G$ and the like fully specified. (So $q$ is a partition of the space of possible $u_T$, and therefore
of $\hat{U}$, and $a$ is an element of that partition.) $Q$ is a set of such $q$'s, differing in $G$, whose associated
answers our computer can (we hope) predict correctly.

The input to the computer is implicitly reflected in its $t = 0$ physical state, as our interpretation
of that state. In this example (though not necessarily in general), that input specifies what question
we want answered, i.e., which $q$ and associated $T$ we are interested in. It also delineates one of
several regions $R \subset \hat{U}$, each of which, intuitively, gives the $t = 0$ state of S. Throughout each such
$R$, the system S is closed from the rest of the universe during $t \in [0, T]$. The precise $R$ delineated
further specifies a set of possible values of $u_0$ (and therefore of the Hamiltonian describing S), for
example by being an element of a (perhaps irregular) finite precision grid over $U_0$, $G'$. If, for some
$R$, $q(\hat{u})$ has the same value for all $\hat{u} \in R$, then this input $R$ uniquely specifies what $a$ is for any
associated $\hat{u}$. If this is not the case, then the $R$ input to the computer does not suffice to answer
question $q$. So for any $q$ and region $R$ both of which can be specified in the computer's input, $R$
must be a subset of a region $q^{-1}(a)$ for some $a$.

Implicit in this definition is some means for correctly getting the information $R$ into the
computer's input. In practice, this is often done by having had the computer coupled to S sometime
before time 0. As an alternative, rather than specify $R$ in the input, we could have the input contain
a "pointer" telling the computer where to look to get the information $R$. (The analysis of these
papers holds no matter how the computer gains access to $R$.) In addition, in practice the input,
giving $R$, $q$, and $T$, is an element of a partition over an "input section" of our computer. In such a
case, the input is itself an element of a finite precision grid over $\hat{U}$, $G''$. So an element of $G''$
specifies an element of $G$ (namely $q$) and an element of $G'$ (namely $R$).

Given its input, the computer (tries to) form its prediction for $a$ by first running the laws of
physics on a $u_0$ having the specified value as measured on $G'$, according to the specified
Hamiltonian, up to the specified time $T$. The computer then applies $q(.)$ to the result. Finally, it writes this
prediction for $a$ onto its output and halts. (More precisely, using some fourth finite precision grid
$G'''$ over its output section, it "writes out" (what we interpret as) its prediction for what region in
$U$ the universe will be in at $T$, that prediction being formally equivalent to a prediction of a region
in $\hat{U}$.) The goal is to have it do this, with the correct value of $a$, by time $\tau < T$. Note that to have
the computer's output be meaningful, it must specify the question $q$ being answered as well as the
answer $a$, i.e., the output must be a physical state of the computer that we interpret as a
question-answer pair.

Consider again the case where there is in fact a correct prediction, i.e., where $R$ is indeed a
subset of the region $q^{-1}(a)$ for some $a$. For this case, formally speaking, "all the computer has to
do" in making its prediction is recognize which such region in the partition $q$ that is input to the
computer contains the region $R$ that is also input to the computer. Then it must output the label of
that region in $q$. In practice though, $q$ and $R$ are usually "encoded" differently, and the computer
must "translate" between those encodings to recognize which region $q^{-1}(a)$ contains $R$; this
translation constitutes the "computation".

Given this definition of a question, we can now define the input and output portions of a
physical computer by generalizing our example of conventional computation.

Definition 2: i) A (computation) partition is a set of disjoint subsets of $\hat{U}$ whose union equals $\hat{U}$,
or equivalently a single-valued mapping from $\hat{U}$ into a non-empty space of partition-element
labels. Unless stated otherwise, any partition is assumed to contain at least two elements.

ii) In an output partition, the space of partition element labels is a space of possible "outputs",
{OUT}.

iii) In a physical computer, we require {OUT} to be the space of all pairs $\{OUT_q \in Q, OUT_a \in
A(OUT_q)\}$, for some $Q$ and $A(.)$ as defined in Def. (1). This space (and therefore the associated
output partition) is implicitly a function of $Q$. To make this explicit, often, rather than an output
partition, we will consider the full associated double $(Q, OUT(.))$, where $OUT(.)$ is the output
partition $\hat{u} \in \hat{U} \to OUT \in \{OUT_q \in Q, OUT_a \in A(OUT_q)\}$. Also, we will find it useful to use an
output partition to define an associated ("prediction") partition, $OUT_p(.) : \hat{u} \to (A(OUT_q(\hat{u})),
OUT_a(\hat{u}))$.

iv) In an input partition, the space of partition element labels is a space of possible "inputs", 
{IN}. 

v) A (physical) computer consists of an input partition and an output partition double. Unless 
explicitly stated otherwise, both of those partitions are required to be (separately) surjective. 

Since we are restricting attention to non-empty $Q$, {OUT} is non-empty. We say that $OUT_q$ is the
"question posed to the computer", and $OUT_a$ is "the computer's answer". The surjectivity of IN(.)
and OUT(.) is a restriction on {IN} and {OUT}, respectively.
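To make Definition 2 concrete, here is a minimal sketch of a physical computer as an input partition plus an output partition over a finite toy universe, together with the derived prediction partition and a check of the surjectivity requirement. The universe, the question label "parity", and all function names are hypothetical choices made for illustration.

```python
# Hypothetical toy universe: four worldlines labelled 0..3 (an assumption for illustration).
U_HAT = (0, 1, 2, 3)

# Answer sets A(q) for the questions the toy computer can output.
ANSWER_SETS = {"parity": frozenset({0, 1})}


def IN(u_hat):
    """Input partition: maps each worldline to an input label."""
    return u_hat // 2


def OUT(u_hat):
    """Output partition: maps each worldline to a (question, answer) pair."""
    return ("parity", u_hat % 2)


def OUT_p(u_hat):
    """Prediction partition: the pair (A(OUT_q), OUT_a)."""
    q, a = OUT(u_hat)
    return (ANSWER_SETS[q], a)


def is_surjective(partition, labels, worldlines=U_HAT):
    """True iff the image of the partition over the worldlines is exactly the label set."""
    return {partition(u) for u in worldlines} == set(labels)


IN_LABELS = {0, 1}
OUT_LABELS = {(q, a) for q, answers in ANSWER_SETS.items() for a in answers}

print(is_surjective(IN, IN_LABELS), is_surjective(OUT, OUT_LABELS))
```

Declaring a label that no worldline realises (for example adding 2 to `IN_LABELS`) makes the check fail, which is the sense in which surjectivity restricts the label spaces {IN} and {OUT}.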

While motivated in large measure by the task of predicting the future, the definition of physi- 
cal computation is far broader, concerning any computation that can be cast in terms of inputs, 
questions about physical states of nature, and associated answers. This set of questions includes in
particular any calculation that can be instantiated in a physical system in our universe, whether
that question is a "prediction" or not. All such physically realizable calculations are subject to the 
results presented below. 

Even in the context of prediction though, the definition of a physical computer presented here 
is much broader than computers that work by the process outlined in Ex. 1 (and therefore the 
associated theorems are correspondingly further-ranging in their implications). For example, the 
computer in Ex. 1 has the laws of physics explicitly built into its "program". But our definition 
allows other kinds of "programs" as well. Our definition also allows other kinds of information 
input to the computer besides $q$ and a region $R$ (which together with $T$ constitute the inputs in Ex.
1). As discussed in paper I, we will only need to require that there be some $t = 0$ state of the
computer that, by accident or by design, induces the correct prediction at $t = \tau$. This means we do not
even require that the computer's initial state IN "accurately describes" the $t = 0$ external universe
in any meaningful sense. Our generalization of Ex. 1 preserves analogues of the grids $G$ (in Q(.)),
$G''$ (in IN(.)) and $G'''$ (in OUT(.)), but not of the grid $G'$.

In fact, our formal definition of a physical computer broadens what we mean by the "input to 
the computer", IN, even further. While the motivation for our definition, exemplified in Ex. 1, has 
the partition IN(.) "fix the initial state of the computer's inputs section", that need not be the case.
IN(.) can reflect any attributes of $\hat{u}$. An "input" (an element of a partition of $\hat{U}$) need not
even involve the $t = 0$ state of the physical computer. In other words, as we use the terms here, the
computer's "input" need not be specified in some $t = 0$ state of a physical device. Indeed, our
definition does not even explicitly delineate the particular physical system within the universe that
we identify with the computer. (A physical computer is simply an input partition together with an 
output partition.) This means we can even choose to have the entire universe "be the computer". 
For our purposes, we do not need tighter restrictions in our definition of a physical computer. 
Nonetheless, a pedagogically useful example is any localized physical device in the real world 
meeting our limited restrictions. No matter how that device works, it is subject to the impossibility 
results described below.

ii) Intelligible computation

Consider a "conventional" physical computer, consisting of an underlying physical system
whose $t = 0$ state sets $IN(\hat{u})$ and whose state at time $\tau$ sets $OUT(\hat{u})$, as in our example above.
We wish to analyze whether the physical system underlying that computer can calculate the future
sufficiently quickly. In doing so, we do not want to allow any of the "computational load" of the
calculation to be "hidden" in a restriction on the possible questions. Our computer must therefore
possess a sufficient degree of flexibility. We impose this via the following construction (see paper I
for a detailed justification):

Definition 3: Consider a physical computer $C = (Q, IN(.), OUT(.))$ and a $\hat{U}$-partition $\pi$. A
function from $\hat{U}$ into $B$, $f$, is an intelligibility function (for $\pi$) if

$\forall\, \hat{u}, \hat{u}' \in \hat{U}, \quad \pi(\hat{u}) = \pi(\hat{u}') \Rightarrow f(\hat{u}) = f(\hat{u}')$.

A set $F$ of such intelligibility functions is an intelligibility set for $\pi$.

We view any intelligibility function as a question by defining $A(f)$ to be the image of $\hat{U}$ under
$f$. If $F$ is an intelligibility set for $\pi$ and $F \subseteq Q$, we say that $\pi$ is intelligible to $C$ with respect to $F$. If
the intelligibility set is not specified, it is implicitly understood to be the set of all intelligibility
functions for $\pi$.

We say that two physical computers $C^1$ and $C^2$ are mutually intelligible (with respect to the
pair $\{F^i\}$) iff both $OUT^2$ is intelligible to $C^1$ with respect to $F^2$ and $OUT^1$ is intelligible to $C^2$ with
respect to $F^1$.

Plugging in, $\pi$ is intelligible to $C$ iff $\forall$ intelligibility functions $f$, $\exists\, q \in OUT_q$ such that $q = f$, i.e.,
such that $A(q)$ = the image of $\hat{U}$ under $f$, and such that $\forall\, \hat{u} \in \hat{U}$, $q(\hat{u}) = f(\hat{u})$. Note that since $\pi$
contains at least two elements, if $\pi$ is intelligible to $C$, $\exists\, OUT_q \in \{OUT_q\}$ such that $A(OUT_q) = B$,
an $OUT_q$ such that $A(OUT_q) = \{0\}$, and one such that $A(OUT_q) = \{1\}$. Usually we are
interested in the case where $\pi$ is an output partition of a physical computer, as in mutual intelligibility.

Intuitively, an intelligibility function for a partition $\pi$ is a mapping from the elements of $\pi$ into
$B$. $\pi$ is intelligible to $C$ if $Q$ contains all binary-valued functions of $\pi$, i.e., if $C$ can have posed any
question concerning the universe as measured on $\pi$. This flexibility in $C$ ensures that $C$'s output
partition isn't "rigged ahead of time" in favor of some particular question concerning $\pi$. Formally,
by the surjectivity of OUT(.), the requirement of intelligibility means that $\exists\, \hat{u}' \in \hat{U}$ such that
$\forall\, \hat{u} \in \hat{U}$, $[OUT_q(\hat{u}')](\hat{u}) = f(\hat{u})$.
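The following sketch enumerates every intelligibility function for a two-cell partition of a finite toy universe and checks intelligibility against two candidate question sets. It is only an illustration under stated assumptions: the universe `U_HAT`, the partition `pi`, the representation of a function as the tuple of its values, and the helper names are all hypothetical.

```python
from itertools import product

# Hypothetical toy universe: four worldlines labelled 0..3.
U_HAT = (0, 1, 2, 3)


def pi(u_hat):
    """A two-cell partition of the worldlines."""
    return u_hat % 2


def intelligibility_functions(partition, worldlines=U_HAT):
    """All binary functions of the worldline that are constant on each cell of `partition`.

    Each function is returned as the tuple of its values over `worldlines`.
    """
    labels = sorted({partition(u) for u in worldlines})
    functions = []
    for bits in product((0, 1), repeat=len(labels)):
        table = dict(zip(labels, bits))
        functions.append(tuple(table[partition(u)] for u in worldlines))
    return functions


def is_intelligible(F, questions):
    """True iff every f in F matches some question in both its values and its answer set A(f)."""
    return all(
        any(values == f and answers == frozenset(f) for answers, values in questions)
        for f in F
    )


F = intelligibility_functions(pi)
# A question set containing every intelligibility function, with A(f) = image of f.
Q_full = [(frozenset(f), f) for f in F]
# A question set that lacks the two constant functions.
Q_small = [(frozenset(f), f) for f in F if len(set(f)) == 2]

print(len(F), is_intelligible(F, Q_full), is_intelligible(F, Q_small))
```

With two cells there are four intelligibility functions, and their images are $B$, $\{0\}$ and $\{1\}$, matching the three answer sets that the text notes must be available to $C$.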

iii) Predictable computation 

We can now formalize the concept of a physical computer's "making a correct prediction": 

Definition 4: Consider a physical computer $C$, partition $\pi$, and intelligibility set for $\pi$, $F$. We say
that $\pi$ is weakly predictable to $C$ with respect to $F$ iff:

i) $\pi$ is intelligible to $C$ with respect to $F$, i.e., $F \subseteq OUT_q$;

ii) $\forall\, f \in F$, $\exists\, IN \in \{IN\}$ that weakly induces $f$, i.e., an $IN$ such that:

$IN(\hat{u}) = IN \;\Rightarrow\; OUT_p(\hat{u}) = (A(OUT_q(\hat{u})), OUT_a(\hat{u})) = (A(f), f(\hat{u}))$.

Intuitively, condition (ii) means that for all questions $q$ in $F$, there is an input state such that if $C$ is
initialized to that input state, $C$'s answer to that question $q$ (as evaluated at $\tau$) must be correct.
Note that we even allow the computer to be mistaken about what question it is answering (i.e.,
for $OUT_q(\hat{u})$ to not equal $f$) so long as $C$'s answer is correct. We will say a computer $C'$ with
output $OUT'(.)$ is weakly predictable to another if the partition $OUT'_p(.)$ is. If we just say
"predictable" it will be assumed that we mean weak predictability.

As a formal matter, note that in the definition of predictable, even though $f(.)$ is surjective onto
$A(f)$ (cf. Def. 3), it may be that for some $IN$, the set of values $f(\hat{u})$ takes on when $\hat{u}$ is restricted
so that $IN(\hat{u}) = IN$ does not cover all of $A(f)$. The reader should also bear in mind that by
surjectivity, $\forall\, IN \in \{IN\}$, $\exists\, \hat{u} \in \hat{U}$ such that $IN(\hat{u}) = IN$.
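Condition (ii) of Definition 4 can be checked by brute force on a finite toy universe. The sketch below is an illustration only: worldlines are hypothetical pairs (input label, external bit), the "correct" computer simply looks its answer up in a table, and condition (i) (intelligibility) is taken for granted rather than checked. The time ordering (answering by $\tau$) is not modelled.

```python
from itertools import product

# Hypothetical toy universe: a worldline is (input label i, external system bit s).
U_HAT = list(product(range(4), (0, 1)))
# TABLES[i][s] is the value of the i-th binary function of s.
TABLES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def pi(u_hat):
    """The partition to be predicted: the external system's bit."""
    return u_hat[1]


def IN(u_hat):
    """Input partition: the input label selects which function of s is computed."""
    return u_hat[0]


def OUT_p(u_hat):
    """Prediction partition (A(OUT_q), OUT_a) of a toy computer that answers correctly."""
    i, s = u_hat
    return (frozenset(TABLES[i]), TABLES[i][s])


def weakly_induces(in_label, f, IN, OUT_p, worldlines):
    """Condition (ii) for one f: on every worldline with this input, OUT_p = (A(f), f)."""
    A_f = frozenset(f(u) for u in worldlines)
    return all(OUT_p(u) == (A_f, f(u)) for u in worldlines if IN(u) == in_label)


def weakly_predictable(F, IN, OUT_p, worldlines):
    """Condition (ii) for all f in F; condition (i) is assumed, not checked."""
    in_labels = {IN(u) for u in worldlines}
    return all(
        any(weakly_induces(x, f, IN, OUT_p, worldlines) for x in in_labels) for f in F
    )


# All four intelligibility functions for pi.
F = [lambda u_hat, t=t: t[pi(u_hat)] for t in TABLES]


def OUT_p_stuck(u_hat):
    """A broken computer whose answer is always 0."""
    return (frozenset({0, 1}), 0)


print(weakly_predictable(F, IN, OUT_p, U_HAT))        # correct computer
print(weakly_predictable(F, IN, OUT_p_stuck, U_HAT))  # broken computer
```

Note that the check only asks for the existence of some input label that forces a correct answer; it places no requirement on how the computer arrives at that answer.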

iv) Distinguishable computers 

There is one final definition that we need before we can establish our unpredictability results: 

Definition 5: Consider a set of $n$ physical computers $\{C^i = (Q^i, IN^i(.), OUT^i(.)) : i = 1, ..., n\}$. We
say $\{C^i\}$ is (input) distinguishable iff $\forall$ $n$-tuples $(IN^1 \in \{IN^1\}, ..., IN^n \in \{IN^n\})$, $\exists\, \hat{u} \in \hat{U}$ such
that $\forall\, i$, $IN^i(\hat{u}) = IN^i$ simultaneously.

We say that $\{C^i\}$ is pairwise (input) distinguishable if any pair of computers from $\{C^i\}$ is
distinguishable, and will sometimes say that any two such computers $C^1$ and $C^2$ "are distinguishable
from each other". We will also say that $\{C^i\}$ is a maximal (pairwise) distinguishable set if there
are no physical computers $C \notin \{C^i\}$ such that $C \cup \{C^i\}$ is a (pairwise) distinguishable set.
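Input distinguishability is easy to test on a finite toy universe: every tuple of input labels must be realised by at least one worldline. The sketch below uses a hypothetical universe of three input bits constrained to even parity, chosen (as an assumption for illustration) so that every pair of input partitions is distinguishable while the full set of three is not.

```python
from itertools import product


def distinguishable(input_partitions, worldlines):
    """True iff every tuple of input labels is realised jointly by at least one worldline."""
    label_sets = [{p(u) for u in worldlines} for p in input_partitions]
    realised = {tuple(p(u) for p in input_partitions) for u in worldlines}
    return realised == set(product(*label_sets))


# Hypothetical universe: three input bits constrained to have even parity.
U_HAT = [u for u in product((0, 1), repeat=3) if sum(u) % 2 == 0]
IN1, IN2, IN3 = (lambda u: u[0]), (lambda u: u[1]), (lambda u: u[2])

print(distinguishable([IN1, IN2], U_HAT))       # a pair
print(distinguishable([IN1, IN2, IN3], U_HAT))  # the full set
print(distinguishable([IN1, IN1], U_HAT))       # a computer paired with itself
```

In this toy case fixing any two inputs forces the third, so pairwise distinguishability does not imply full distinguishability, and no input partition with at least two labels is distinguishable from itself.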

v) The impossibility of posing arbitrary questions to a computer

The first result in paper I states that for any pair of physical computers there are always 
binary-valued questions about the state of the universe that cannot even be posed to at least one of 
those physical computers: 

Theorem 1: Consider any pair of physical computers $\{C^i : i = 1, 2\}$. Either $\exists$ finite intelligibility
set $F^2$ for $C^2$ such that $C^2$ is not intelligible to $C^1$ with respect to $F^2$, and/or $\exists$ finite intelligibility
set $F^1$ for $C^1$ such that $C^1$ is not intelligible to $C^2$ with respect to $F^1$.

Thm. 1 reflects the fact that while we do not want to have C's output partition "rigged ahead of
time" in favor of some single question, we also cannot require too much flexibility of our
computer. It is necessary to balance these two considerations. Accordingly, before analyzing
prediction of the future, to circumvent Thm. 1 we must define a restricted kind of intelligibility set to
which Thm. 1 does not apply:



Definition 6: An intelligibility function $f$ for an output partition OUT(.) is question-independent
iff $\forall\, \hat{u}, \hat{u}' \in \hat{U}$:

$OUT_p(\hat{u}) = OUT_p(\hat{u}') \;\Rightarrow\; f(\hat{u}) = f(\hat{u}')$.

An intelligibility set as a whole is question-independent if all its elements are.

We write $C^1 > C^2$ (or equivalently $C^2 < C^1$) and say simply that $C^2$ is (weakly) predictable to
$C^1$ (or equivalently that $C^1$ can predict $C^2$) if $C^2$ is weakly predictable to $C^1$ for all
question-independent finite intelligibility sets for $C^2$.

Similarly, from now on we will say that $C^2$ is intelligible to $C^1$ without specification of an
intelligibility set if $C^2$ is intelligible to $C^1$ with respect to all question-independent finite
intelligibility sets for $C^2$.



Intuitively, f is question-independent if its value does not vary with q among any set of q all of 
which share the same A(q). As an example, say our physical computer is a conventional digital 
workstation. Have a certain section of the workstation's RAM be designated the "output section" 
of that workstation. That output section is further divided into a "question subsection" designating 
(i.e., "containing") a q, and an "answer subsection" designating an a. Say that for all q that can be 
designated by the question subsection A(q) is a single bit, i.e., we are only interested in binary- 
valued questions. Then for a question-independent f, the value of f can only depend on whether 
the answer subsection contains a 0 or a 1. It cannot vary with the contents of the question
subsection.
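The workstation example can be mimicked in a few lines. In the sketch below a worldline is a hypothetical pair (contents of the question subsection, contents of the answer subsection), every question is assumed to be binary-valued, and `is_question_independent` tests the condition of Definition 6 directly. All names are illustrative assumptions.

```python
from itertools import product

B = frozenset({0, 1})
# Hypothetical worldlines: (contents of question subsection, contents of answer subsection).
U_HAT = list(product(("q1", "q2"), (0, 1)))


def OUT_p(u_hat):
    """Prediction partition (A(OUT_q), OUT_a); here every question is binary-valued."""
    _question, answer = u_hat
    return (B, answer)


def is_question_independent(f, prediction_partition, worldlines):
    """True iff f takes a single value on each cell of the prediction partition."""
    seen = {}
    for u_hat in worldlines:
        key = prediction_partition(u_hat)
        if seen.setdefault(key, f(u_hat)) != f(u_hat):
            return False
    return True


def flip_answer(u_hat):
    """Depends only on the answer bit."""
    return 1 - u_hat[1]


def is_q1(u_hat):
    """Depends on which question is stored in the question subsection."""
    return int(u_hat[0] == "q1")


print(is_question_independent(flip_answer, OUT_p, U_HAT))
print(is_question_independent(is_q1, OUT_p, U_HAT))
```

Both functions are intelligibility functions for the output partition itself, but only the first survives the stronger requirement of being constant on the cells of the prediction partition.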

A detailed example of a pair of mutually (question-independent) intelligible computers is
presented in paper I. In addition to this explicit demonstration that Thm. 1 does not hold for
question-independent intelligibility sets, examples 2, 2', and 2" of paper I establish that:

a) There are pairs of input-distinguishable physical computers, $C^1$, $C^2$, in which $C^2$ is predictable
to $C^1$, $C^1 > C^2$;

b) Given $C^1$ and $C^2$ as in (a), we could have yet another computer $C^3$ that also predicts $C^2$ (i.e.,
such that $C^3 > C^2$) while being distinguishable from C ;

c) Given $C^1$ and $C^2$ as in (a), we could have a computer $C^4$, distinguishable from both $C^1$ and $C^2$,
where $C^4 > C^1$, so that $C^4 > C^1 > C^2$. We can do this either with $C^4 > C^2$ or not.

vi) The impossibility of assuredly correct prediction

To establish our main impossibility result in paper I we started with the following lemma: 

Lemma 1: Consider a physical computer $C^1$. If $\exists$ any output partition OUT that is intelligible to
$C^1$, then $\exists\, q^1 \in Q^1$ such that $A(q^1) = B$, a $q^1 \in Q^1$ such that $A(q^1) = \{0\}$, and a $q^1 \in Q^1$ such that
$A(q^1) = \{1\}$.

This can be used to establish paper I's central theorem:

Theorem 2: Consider any pair of distinguishable physical computers $\{C^i : i = 1, 2\}$. It is not
possible that both $C^1 > C^2$ and $C^1 < C^2$.

Restating it, Thm. 2 says that either $\exists$ finite question-independent intelligibility set for $C^1$, $F^1$,
such that $C^1$ is not predictable to $C^2$ with respect to $F^1$, and/or $\exists$ finite question-independent
intelligibility set for $C^2$, $F^2$, such that $C^2$ is not predictable to $C^1$ with respect to $F^2$.

Thm. 2 holds no matter how large and powerful our computers are; it even holds if the
"physical system underlying" one or both of our computers is the whole universe. It also holds if instead
$C^2$ is the rest of the physical universe external to $C^1$. A set of implications of Thm. 2 for various
kinds of physical prediction scenarios are discussed in paper I. As also discussed there,
impossibility results that are in some senses even stronger than those associated with Thm. 2 hold when
we do not restrict ourselves to distinguishable computers, as we do in Thm. 2.

2. THE MATHEMATICAL STRUCTURE RELATING PHYSICAL COMPUTERS

There is a rich mathematical structure governing the possible predictability relationships 
among sets of physical computers, especially if one relaxes the presumption (obtaining in much of 
paper I) that the universe can contain multiple copies of C. This section presents some of that 
structure. 

i) The graphical structure over a set of computers induced by weak predictability 

While it directly concerns pairs of physical computers, Thm. 2 also has implications for the 
predictability relationships within sets of more than two computers. An example is the following: 

Corollary 1: It is not possible to have a fully distinguishable set of $n$ physical computers $\{C^i\}$
such that $C^1 > C^2 > ... > C^n > C^1$.

Proof: Hypothesize that the corollary is wrong. Define the composite device $C^* = (IN^*(.) =
\prod_{i=1}^{n-1} IN^i(.), Q^1, OUT^1(.))$. Since $\{C^i\}$ is fully distinguishable, $IN^*(.)$ is surjective. Therefore $C^*$
is a physical computer.

Since by hypothesis $C^n$ is intelligible to $C^{n-1}$, $\exists\, OUT^{n-1}_q$ such that $A(OUT^{n-1}_q) = B$. Also,
since $C^{n-2} > C^{n-1}$, $\exists\, IN^{n-2} \in \{IN^{n-2}\}$ such that $\forall\, \hat{u} \in \hat{U}$ for which $A(OUT^{n-1}_q(\hat{u})) = B$,
$IN^{n-2}(\hat{u}) = IN^{n-2} \Rightarrow OUT^{n-2}_a(\hat{u}) = OUT^{n-1}_a(\hat{u})$. Iterating and exploiting full
distinguishability, $\exists\, (IN^1, ..., IN^{n-2})$ such that $\forall\, \hat{u} \in \hat{U}$ for which $A(OUT^{n-1}_q(\hat{u})) = B$,
$(IN^1(\hat{u}), ..., IN^{n-2}(\hat{u})) = (IN^1, ..., IN^{n-2}) \Rightarrow OUT^*_a(\hat{u}) = OUT^1_a(\hat{u}) = OUT^{n-1}_a(\hat{u})$.
The same holds when we restrict $\hat{u}$ so that the space $A(OUT^{n-1}_q(\hat{u})) = \{1\}$, and when we
restrict $\hat{u}$ so that $A(OUT^{n-1}_q(\hat{u})) = \{0\}$.

Since by hypothesis $C^n$ is intelligible to $C^{n-1}$, and since $IN^*(.)$ is surjective, this result means
that $C^n$ is predictable to $C^*$. Conversely, since $C^n > C^1$ by hypothesis, the output partition of $C^*$ is
predictable to $C^n$, and therefore $C^*$ is. Finally, since $\{C^i\}$ is fully distinguishable, $C^*$ and $C^n$ are
distinguishable. Therefore Thm. 2 applies, and by using our hypothesis we arrive at a
contradiction. QED.

What are the general conditions under which two computers can be predictable to one 
another? By Thm. 2, we know they aren't if they're input-distinguishable. What about if they're 
one and the same? No physical computer is input-distinguishable from itself, so Thm. 2 doesn't 
apply to this issue. However it still turns out that Thm. 2's implication holds for this issue: 

Theorem 3: No physical computer is predictable to itself. 

Proof. Assume the theorem is wrong, and some computer $C$ is predictable to itself. Since by
definition predictability implies intelligibility, we can apply Lemma 1 to establish that there is a $q \in
OUT_q$, $q'$, such that $A(q') = B$. Therefore one question-independent intelligibility function for $C$ is
the function $f$ from $\hat{u} \in \hat{U} \to B$ that equals 1 if $A(OUT_q(\hat{u})) = B$ and $OUT_a(\hat{u}) = 0$, and equals 0
otherwise. Therefore by hypothesis $\exists\, IN \in \{IN\}$ such that $IN(\hat{u}) = IN \Rightarrow A(OUT_q(\hat{u})) = B$
and $OUT_a(\hat{u}) = f(\hat{u})$. But if $A(OUT_q(\hat{u})) = B$, then $f(\hat{u}) = NOT[OUT_a(\hat{u})]$, by definition of
$f(.)$. Since IN is surjective, this means that there is at least one $\hat{u} \in \hat{U}$ such that $A(OUT_q(\hat{u})) = B$
and $OUT_a(\hat{u}) = NOT[OUT_a(\hat{u})]$. This is impossible. QED.

Intuitively, this result holds due to the fact that a computer cannot make as its prediction the logi- 
cal inverse of its prediction. An important corollary of this result is that no output partition is pre- 
dictable to a physical computer that has that output partition. Combining Thm. 3 and Coroll. 1 and 
identifying the predictability relationship with an edge in a graph, we see that fully distinguish- 
able sets of physical computers constitute (unions of) directed acyclic graphs. 






ii) Weak predictability and variants of error correction 

When considering sets of more than two computers, it is important to realize that while it is 
symmetric, the input-distinguishability relation need not be transitive. Accordingly, separate pair- 
wise distinguishable sets of computers may partially "overlap" one another. Similarly, stipulating 
the values of the inputs of any two computers in a pairwise-distinguishable set may force some of 
the other computers in that set to have a particular input value. 

Coroll. 1 does not apply to such a set. As it turns out though, Thm. 2 still has strong implica- 
tions even for a set of more than two computers that is not fully distinguishable, so long as the set 
is pairwise distinguishable. Define a god computer as any physical computer in a pairwise distin- 
guishable set such that all other physical computers in that set are predictable to the god computer. 
Then by Thm. 2, each such set can contain at most one god computer. There is at most one com- 
puter in any pairwise distinguishable set that can correctly predict the future of all other members 
of that set, and more generally at most one that can accurately predict the past of, observe, and/or 
control any system in that set (see paper I). In particular, for any human being physical computer, 
for any pairwise distinguishable set of computers including that human, there can be at most one 
god computer. (Lest one read too much into the phrase "god computer", note that like any other 
computer, a god computer is merely a set of partitions, and need not correspond to any localized 
physical apparatus.) 
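
The "at most one god computer" claim can be checked by brute force on small sets. The following is only an illustrative sketch, not part of the formalism: it models "computer j is predictable to computer i" as a directed edge (i, j), and encodes Thm. 2 as the assumption that no pair of nodes has edges in both directions. The function names are hypothetical.

```python
from itertools import product

def god_nodes(n, edges):
    """Return the nodes that have an edge to every other node."""
    return [i for i in range(n)
            if all((i, j) in edges for j in range(n) if j != i)]

def max_god_count(n):
    """Largest number of god nodes over all digraphs on n nodes
    that contain no mutual edges (the constraint taken from Thm. 2)."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    best = 0
    # Each unordered pair is either unconnected, i -> j, or j -> i.
    for choice in product((0, 1, 2), repeat=len(pairs)):
        edges = set()
        for (i, j), c in zip(pairs, choice):
            if c == 1:
                edges.add((i, j))
            elif c == 2:
                edges.add((j, i))
        best = max(best, len(god_nodes(n, edges)))
    return best

counts = [max_god_count(n) for n in (2, 3, 4)]
```

Under these assumptions every entry of `counts` is 1: two god nodes would have to predict each other, which the no-mutual-edge constraint rules out.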

Even a god computer may not be able to correctly predict all other computers in its
distinguishable set simultaneously. The input value it needs to adopt to correctly predict some $C^2$ may
preclude it from correctly predicting some $C^3$, and vice-versa. One way to analyze this issue is to
consider a composite partition $OUT^{2\times3}$ defined by the output partitions of $C^2$ and $C^3$. We can then
investigate whether and when our god computer can weakly predict the composite output partition.
The following definition formalizes this:

Definition 7: Consider a pairwise distinguishable set $\{C^i\}$ with god computer $C^1$. Define the
partitions $OUT^{i\times j}(\hat{u} \in \hat{U}) \equiv (OUT^{i\times j}_q(\hat{u}), OUT^{i\times j}_\alpha(\hat{u}))$, where each answer map
$OUT^{i\times j}_\alpha(\hat{u}) \equiv (OUT^i_\alpha(\hat{u}), OUT^j_\alpha(\hat{u}))$, and each question $[OUT^{i\times j}_q(\hat{u})] \equiv$ the mapping given by
$\hat{u}' \in \hat{U} \rightarrow ([OUT^i_q(\hat{u})](\hat{u}'), [OUT^j_q(\hat{u})](\hat{u}'))$. Then $C^1$ is omniscient if $OUT^{2\times3\times\cdots}$ is weakly
predictable to $C^1$.

Intuitively, $OUT^{i\times j}$ is just the double partition $(OUT^i(.), OUT^j(.)) = ((OUT^i_q(.), OUT^i_\alpha(.)),
(OUT^j_q(.), OUT^j_\alpha(.)))$, re-expressed to be in terms of a single question-valued partition and a single
answer-valued partition. To motivate this re-expression, for any two questions $q^i \in Q^i$ and $q^j \in Q^j$,
let $q^i \times q^j$ be the ordered product of the partitions $q^i$ and $q^j$; it is the partition assigning to every
point $\hat{u}' \in \hat{U}$ the label $(q^i(\hat{u}'), q^j(\hat{u}'))$. Then if $OUT^i_q(\hat{u})$ is the question $q^i$ and $OUT^j_q(\hat{u})$ is
the question $q^j$, $OUT^{i\times j}_q(\hat{u})$ is the question $q^i \times q^j$. $OUT^{i\times j}_\alpha$ is defined similarly, only with one
fewer level of "indirection", since answer components of output partitions are not themselves
partitions (unlike question components).
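
The ordered product of two partitions can be made concrete on a toy universe. This is a minimal sketch under assumed conventions: a partition (question) is represented as a Python function labelling each point of a small finite universe, and the names `product_partition`, `q_i`, `q_j` and the six-point universe are hypothetical choices for illustration only.

```python
def product_partition(q_i, q_j):
    """Ordered product of two partitions: label each point u with (q_i(u), q_j(u))."""
    return lambda u: (q_i(u), q_j(u))

universe = range(6)          # toy stand-in for the universe space (assumption)
q_i = lambda u: u % 2        # one binary question
q_j = lambda u: u // 3       # another binary question
q_ij = product_partition(q_i, q_j)

# Group the points of the universe by their product label.
cells = {}
for u in universe:
    cells.setdefault(q_ij(u), []).append(u)
```

Each cell of the product partition is the intersection of one cell of `q_i` with one cell of `q_j`, so the product is at least as fine as either factor.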

Note that even though any $OUT^i(.)$ and $OUT^j(.)$ are both surjective mappings, $OUT^{i\times j}$ need
not be surjective onto the set of quadruples $\{q^i \in Q^i, q^j \in Q^j, \alpha^i \in A(q^i), \alpha^j \in A(q^j)\}$. It is
straightforward to verify that an omniscient computer is a god computer.

In general, one might presume that two non-god computers in a pairwise-distinguishable set 
could have the property that, while individually they cannot predict everything, considered jointly 
they would constitute a god computer, if only they could work cooperatively. An example of such 
cooperativity would be having one of the computers predict when the other one's prediction is 
wrong. It turns out though that under some circumstances the mere presence of some other com- 
puter in that pairwise distinguishable set may make such error-correction impossible, if that other 
computer is omniscient. 

As an example of this, say we have three pair-wise distinguishable computers $C^1$, $C^2$, $C^3$,
where $C^3$ always answers with a bit (i.e., $\nexists\, q^3 \in Q^3$ such that $A(q^3) \not\subseteq B$). We will want $C^2$'s
output to "correct" $C^3$'s predictions, and have those predictions potentially concern $C^1$. So have $C^1$
be intelligible to $C^3$. As a technical condition, assume not only that $C^3$'s output can be any of its
possible question-answer pairs, but also that for any of its questions, for any of the associated
possible answers, there are situations where that answer is correct (so that $C^2$ should leave $C^3$'s
answer alone in those situations). Then it turns out that due to Thm. 2, if $C^1$ is omniscient, it is not
possible that $C^2$ always correctly outputs a bit saying whether $C^3$'s answer is the correct response
to $C^3$'s question. More formally,

Corollary 2: Consider three pair-wise distinguishable computers $C^1$, $C^2$, $C^3$, where $\nexists\, q^3 \in Q^3$
such that $A(q^3) \not\subseteq B$. Assume that $C^1$ is an omniscient computer, and that $C^1$ is intelligible to $C^3$.
Finally, assume that $\forall$ pairs $(q^3 \in Q^3, \alpha^3 \in A(q^3))$, $\exists\, \hat{u} \in \hat{U}$ such that both $OUT^3_q(\hat{u}) = q^3$ and
$q^3(\hat{u}) = \alpha^3$ (i.e., $[OUT^3_q(\hat{u})](\hat{u}) = \alpha^3$). Then it is not possible that $\forall\, \hat{u} \in \hat{U}$, $OUT^2_\alpha(\hat{u}) = 1$
if $[OUT^3_q(\hat{u})](\hat{u}) = OUT^3_\alpha(\hat{u})$, 0 otherwise.

Proof: Hypothesize that the corollary is wrong. Construct a composite device $C^{2-3}$, starting by
having $IN^{2-3}(.) = OUT^3_q(.)$, $Q^{2-3} = Q^3$ and $OUT^{2-3}_q(.) = OUT^3_q(.)$. Next define the question $\theta$ by
the rule $\theta(\hat{u}) = NOT[OUT^3_\alpha(\hat{u})]$ if $OUT^2_\alpha(\hat{u}) = 0$, $\theta(\hat{u}) = OUT^3_\alpha(\hat{u})$ otherwise. (N.b. no
assumption is made that $\theta \in Q^{2-3}$.) To complete the definition of the composite computer $C^{2-3}$,
have $OUT^{2-3}_\alpha(\hat{u}) = \theta(\hat{u})$.

Now by our hypothesis, $\forall\, \hat{u} \in \hat{U}$, $\theta(\hat{u}) = [OUT^3_q(\hat{u})](\hat{u})$. By the last of the conditions
specified in the corollary, this means that $\forall\, (q^{2-3} \in Q^{2-3}, \alpha^{2-3} \in A(q^{2-3}))$, $\exists\, \hat{u}$ such that
$OUT^{2-3}_q(\hat{u}) = q^{2-3}$ and $OUT^{2-3}_\alpha(\hat{u}) = \alpha^{2-3}$. So $C^{2-3}$ allows all possible values of $\{OUT^{2-3}\}$, as
a physical computer must. Due to surjectivity of $OUT^3_q$, it also allows all possible values of the
space $\{IN^{2-3}\}$. To complete the proof that $C^{2-3}$ is a (surjective) physical computer, we must
establish that $OUT^{2-3}_\alpha(\hat{u}) \in A(OUT^{2-3}_q(\hat{u}))$ $\forall\, \hat{u} \in \hat{U}$. To do this note that if for example
$A(OUT^{2-3}_q(\hat{u})) = A(OUT^3_q(\hat{u})) = \{1\}$, then since it is always the case that $OUT^{2-3}_\alpha(\hat{u}) =
[OUT^{2-3}_q(\hat{u})](\hat{u}) = [OUT^3_q(\hat{u})](\hat{u})$, $OUT^{2-3}_\alpha(\hat{u}) = 1$. Similarly $OUT^{2-3}_\alpha(\hat{u}) \in
A(OUT^{2-3}_q(\hat{u}))$ when $A(OUT^{2-3}_q(\hat{u})) = \{0\}$. Finally, if $A(OUT^{2-3}_q(\hat{u})) = B$, then the simple
fact that $OUT^{2-3}_\alpha(\hat{u}) \in B$ always means that $OUT^{2-3}_\alpha(\hat{u}) \in A(OUT^{2-3}_q(\hat{u}))$.

Since $C^1$ is intelligible to $C^3$ and $Q^{2-3} = Q^3$, $C^1$ is intelligible to $C^{2-3}$. Moreover, given any
question $q^{2-3} \in Q^{2-3}$, $\exists$ associated $IN^{2-3} \in \{IN^{2-3}\}$ such that $\forall\, \hat{u} \in \hat{U}$ for which $IN^{2-3}(\hat{u}) =
IN^{2-3}$, $OUT^{2-3}_q(\hat{u}) = q^{2-3}$. But as was just shown, $OUT^{2-3}_\alpha(\hat{u}) = q^{2-3}(\hat{u})$ for that $\hat{u}$. Therefore
$C^1$ is predictable to $C^{2-3}$.

Next, since $C^1$ is omniscient, $OUT^{2\times3}$ is intelligible to $C^1$. Therefore any binary function of
the regions defined by quadruples $(A(OUT^2_q(\hat{u})), A(OUT^3_q(\hat{u})), OUT^2_\alpha(\hat{u}), OUT^3_\alpha(\hat{u}))$ is
an element of $Q^1$. Any single such region is wholly contained in one region defined by the pair
$(A(OUT^{2-3}_q(\hat{u})), OUT^{2-3}_\alpha(\hat{u}))$ though. Therefore any binary function of the regions defined by
such pairs is an element of $Q^1$. Therefore $C^{2-3}$ is intelligible to $C^1$. Similarly, the value of any
such binary function must be given by $OUT^1_\alpha(\hat{u})$ whenever $IN^1(\hat{u})$ equals some associated $IN^1$.
So $C^{2-3}$ is predictable to $C^1$.

Finally, since $C^1$ and $C^3$ are input-distinguishable, so are $C^1$ and $C^{2-3}$, and therefore Thm. 2
applies. This establishes that our hypothesis results in a contradiction. QED.

This result even holds if $OUT^{2\times3}$ is only intelligible to $C^1$, without necessarily being predictable
to it.

Coroll. 2 can be viewed as a restriction on the efficacy of any error correction scheme in the 
presence of a (distinguishable) omniscient computer. There are other restrictions that hold even in 
the absence of such a third computer. An example is the following implication of Thm. 2: 

Corollary 3: Consider two distinguishable mutually intelligible physical computers $C^1$ and $C^2$,
where both $A(OUT^1_q) \subseteq B$ and $A(OUT^2_q) \subseteq B$ $\forall\, OUT^1_q \in Q^1$ and $OUT^2_q \in Q^2$. It is impossible
that $C^1$ and $C^2$ are "anti-predictable" to each other, in the sense that for each of them, the
prediction they make concerning the state of the other can always be made to be wrong by appropriate
choice of input.

Proof: By assumption $C^1$ and $C^2$ are mutually intelligible. So what we must establish is whether
for both of them, for all intelligibility functions concerning the other one, there exists an
appropriate value of $IN^i$ such that that intelligibility function is incorrectly predicted.

Hypothesize that the corollary is wrong. Then $\forall$ question-independent intelligibility functions
for $C^1$, $f^1$, $\exists\, IN^2 \in \{IN^2\}$ such that $IN^2(\hat{u}) = IN^2$ implies that $[A(OUT^2_q(\hat{u})) = NOT[A(f^1)]] \wedge
[OUT^2_\alpha(\hat{u}) = NOT[f^1(\hat{u})]]$. However by definition of question-independent intelligibility
functions, given any such $f^1$, there must be another question-independent intelligibility function for
$C^1$, $f^3$, defined by $f^3(.) \equiv NOT(f^1(.))$. Therefore $\exists\, IN^2 \in \{IN^2\}$ such that $IN^2(\hat{u}) = IN^2$ implies
that $[A(OUT^2_q(\hat{u})) = A(f^3)] \wedge [OUT^2_\alpha(\hat{u}) = f^3(\hat{u})]$.

This NOT(.) transformation bijectively maps the set of all question-independent intelligibility
functions for $C^1$ onto itself. Since that set is finite, this means that the image of the set under the
NOT(.) transformation is the set itself. Therefore our hypothesis means that all question-independent
functions for $C^1$ can be predicted correctly by $C^2$ for appropriate choice of $IN^2 \in \{IN^2\}$. By
similar reasoning, we see that $C^1$ can always predict $C^2$ correctly. Since $C^1$ and $C^2$ are
distinguishable, we can now apply Thm. 2 and arrive at a contradiction. QED.
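
The step in this proof that NOT(.) maps the set of intelligibility functions onto itself can be checked on a toy case. The sketch below assumes, purely for illustration, that the relevant functions are all binary-valued functions on a small finite set of labels, each encoded as a tuple of 0/1 values; the names `all_binary_functions` and `negate` are hypothetical.

```python
from itertools import product

def all_binary_functions(domain):
    """All functions domain -> {0, 1}, each encoded as a tuple of output values."""
    return set(product((0, 1), repeat=len(domain)))

def negate(f):
    """Pointwise logical NOT of a binary function."""
    return tuple(1 - v for v in f)

domain = ["a", "b", "c"]                 # toy set of partition labels (assumption)
funcs = all_binary_functions(domain)
image = {negate(f) for f in funcs}

onto_itself = (image == funcs)                                # NOT is onto the same set
is_involution = all(negate(negate(f)) == f for f in funcs)    # hence also one-to-one
```

In this toy setting, being able to make every prediction wrong is therefore the same as being able to make every prediction right, which is the reduction to Thm. 2 used in the proof.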

iii) Strong predictability 

At the other end of the spectrum from distinguishable computers is the case where one com- 
puter's input can fix another's, either by being observed by that other computer or by setting that 
other computer's input more directly. The following variant of predictability captures this rela- 
tionship: 

Definition 8: Consider a pair of physical computers $C^1$ and $C^2$. We say that $C^2$ is strongly
predictable to $C^1$ (or equivalently that $C^1$ can strongly predict $C^2$), and write $C^1 \gg C^2$ (or equivalently
$C^2 \ll C^1$) iff:

i) $C^2$ is intelligible to $C^1$;

ii) $\forall$ question-independent intelligibility functions for $C^2$, $q^1$, $\forall\, IN^2 \in \{IN^2\}$,
$\exists\, IN^1 \in \{IN^1\}$ that strongly induces the pair $(q^1, IN^2)$, i.e., such that:

$IN^1(\hat{u}) = IN^1 \Rightarrow [OUT^1_p(\hat{u}) = (A(q^1), q^1(\hat{u}))] \wedge [IN^2(\hat{u}) = IN^2]$.



Intuitively, if $C^1$ can strongly predict $C^2$, then for any $IN^2$ and associated implication $OUT^2_p$
(that is, for any computation $C^2$ might undertake) there is an input to $C^1$ that is uniquely associated
with $IN^2$ and that causes $C^1$ to output (any desired question-independent intelligibility function
of) $OUT^2_p$. Intuitively, there is some invertible "translating" map that takes $C^2$'s input and
"encodes" it in $C^1$'s input, in such a way that $C^1$ can "emulate" $C^2$ running on $C^2$'s input, and
thereby produce $C^2$'s associated output. In this way $C^1$ can emulate $C^2$, much like universal Turing
machines can emulate other Turing machines. (Recall the definition of universal Turing
machine, and see the definition of a universal physical computer below.)

Strong predictability of a computer implies weak predictability of that computer. (Unlike with 
weak predictability, there is no such thing as strong predictability of a partition.) So for example 
both Thm. 3 and Coroll. 1 still hold if they are changed by replacing weak predictability with 
strong predictability. However weak predictability does not imply strong predictability. Moreover, 
the mathematics for sets of physical computers some of which are strongly predictable to each 
other (and therefore not distinguishable) differs in some respects from that when all the computers 
are distinguishable (the usual context for investigations of weak predictability). An example is the 
following result, which shows that strong predictability always is transitive, unlike weak predict- 
ability (cf. Ex. 2" in paper I): 

Theorem 4: Consider three physical computers $\{C^1, C^2, C^3\}$, and a partition $\pi$, where both $C^3$
and $\pi$ are intelligible to $C^1$.

i) $C^1 \gg C^2 > \pi \Rightarrow C^1 > \pi$;

ii) $C^1 \gg C^2 \gg C^3 \Rightarrow C^1 \gg C^3$.

Proof: To prove (i), let f be any question-independent intelligibility function for $\pi$. By Lemma 1,
the everywhere 0-valued question-independent intelligibility function of $\pi$ is contained in $Q^1$, and
since $C^1 > C^2$, there must be an $IN^1$ such that $IN^1(\hat{u}) = IN^1 \Rightarrow OUT^1_\alpha(\hat{u}) = 0$. The same is true
for the everywhere 1-valued function. Therefore to prove the claim we need only establish that for
every question-independent intelligibility function for $\pi$, f, for which $A(f) = B$, $f \in Q^1$, and there
exists an $IN^1$ such that $IN^1(\hat{u}) = IN^1 \Rightarrow OUT^1_\alpha(\hat{u}) = f(\hat{u})$. Restrict attention to such f from
now on.

Define a question-independent intelligibility function for $C^2$, $I^2$, such that $A(I^2) = B$, and such
that for all $\hat{u}$ for which $A(OUT^2_q(\hat{u})) = B$, $I^2(\hat{u}) = OUT^2_\alpha(\hat{u})$. (Note that since $C^2 > \pi$, there
both exist $\hat{u}$ for which $OUT^2_p(\hat{u}) = (B, 1)$ and $\hat{u}$ such that $OUT^2_p(\hat{u}) = (B, 0)$.) Now by hypothesis,
for any of the f we are considering, $\exists\, IN^2_f \in \{IN^2\}$ such that $IN^2(\hat{u}) = IN^2_f \Rightarrow OUT^2_p(\hat{u})
= (B, f(\hat{u}))$. However the fact that $C^1 \gg C^2 \Rightarrow \exists\, IN^1 \in \{IN^1\}$ such that $IN^1(\hat{u}) = IN^1 \Rightarrow
IN^2(\hat{u}) = IN^2_f$ and such that $OUT^1_p(\hat{u}) = (A(I^2), I^2(\hat{u})) = (B, I^2(\hat{u}))$. Since $IN^2(\hat{u}) = IN^2_f$ for
such a $\hat{u}$, $A(OUT^2_q(\hat{u})) = B$, and therefore $I^2(\hat{u}) = OUT^2_\alpha(\hat{u})$. So $OUT^2_p(\hat{u})$ for such a $\hat{u}$
equals $(B, OUT^2_\alpha(\hat{u}))$. So for that $IN^1$, $OUT^1_p(\hat{u}) = (A(f), f(\hat{u}))$.

This establishes (i). The proof for (ii) goes similarly, with the redefinition that $IN^1$ fixes the
value of $IN^3$ as well as ensuring that $OUT^2_p(\hat{u}) = (A(f), f(\hat{u}))$. QED.

Strong predictability obeys the following result which is analogous to both Thm.'s 2 and 3: 

Theorem 5: Consider any pair of physical computers $\{C^i : i = 1, 2\}$. It is not possible that both
$C^1 \gg C^2$ and $C^1 \ll C^2$.

Proof: Choose any $IN^2$. For any question-independent intelligibility function of $OUT^2_p$, f, there
must exist an $IN^1_f \in \{IN^1\}$ that strongly induces $IN^2$ and f, since $C^1 \gg C^2$. Label any such $IN^1$ as
$IN^1_f$ ($IN^2$ being implicitly fixed). So for any such f, $\{\hat{u} : IN^1(\hat{u}) = IN^1_f\} \subseteq \{\hat{u} : IN^2(\hat{u}) = IN^2\}$.
However since $OUT^2$ is not empty, there are at least two question-independent intelligibility
functions of $OUT^2_p$, $f_1$ and $f_2$, where $A(f_1) \ne A(f_2)$ (cf. Lemma 1). Moreover, the intersection
$\{\hat{u} : IN^1(\hat{u}) = IN^1_{f_1}\} \cap \{\hat{u} : IN^1(\hat{u}) = IN^1_{f_2}\} = \emptyset$, since these two sets induce different
$A(OUT^1_q)$ (namely $A(f_1)$ and $A(f_2)$, respectively). This means that $\{\hat{u} : IN^1(\hat{u}) = IN^1_{f_1}\} \subset
\{\hat{u} : IN^2(\hat{u}) = IN^2\}$. On the other hand, for the same reasons, there must also exist an $IN^2$ that
strongly induces $IN^1_{f_1}$. Therefore $\exists\, IN^{2\prime}$ such that $\{\hat{u} : IN^2(\hat{u}) = IN^{2\prime}\} \subseteq \{\hat{u} : IN^1(\hat{u}) =
IN^1_{f_1}\}$. So $\{\hat{u} : IN^2(\hat{u}) = IN^{2\prime}\} \subset \{\hat{u} : IN^2(\hat{u}) = IN^2\}$. This is not compatible with the fact that
$IN^2(.)$ is a partition. QED.
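
The final step of this proof rests on a basic property of partitions: two cells are either identical or disjoint, so no cell can be a proper subset of another. A minimal sketch of that property follows, assuming a partition is given as a labelling function on a small finite universe; the helper names and the eight-point universe are hypothetical.

```python
def cells_of(partition, universe):
    """Group the points of the universe into cells by their partition label."""
    cells = {}
    for u in universe:
        cells.setdefault(partition(u), set()).add(u)
    return list(cells.values())

def has_properly_nested_cells(partition, universe):
    """True if some cell is a proper subset of another cell."""
    cs = cells_of(partition, universe)
    return any(a < b for a in cs for b in cs)   # '<' is proper subset for sets

universe = range(8)                              # toy universe (assumption)
partition = lambda u: u % 3                      # toy input partition (assumption)
nested = has_properly_nested_cells(partition, universe)
```

Here `nested` is False, and the same holds for any labelling function, since each point receives exactly one label.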

Many of the conditions in the preceding results can be weakened and the associated
conclusions still hold. Indeed, this is even true for Thm. 2, where we can weaken the definition of
"intelligibility" and still establish the impossibility of having both $C^1 > C^2$ and $C^2 > C^1$. (For example,
that impossibility will still obtain even if neither $C^1$ nor $C^2$ contains B-valued questions, if they
instead contain all possible functions mapping each other's values of $OUT_p$ onto {0, 1, 2}.) These
weakened versions are usually more obscure though, which is why they are not presented here.

iv) Physical computation analogues of Halting theorems in Turing machine theory 

There are several ways that one can relate the mathematical structure of physical computation 
to that of conventional computer science. Here we sketch the salient concepts for one such rela- 
tion coupling physical computation and the mathematical structure governing Turing machines 
(TMs). 

A TM is a device that takes in an input string on an input tape, then based on it produces 
a sequence of output strings, either "halting" at some time with a final output string, or never halt- 
ing. If desired, the fact that the halt state has / hasn't been entered by any time can be reflected in 
a special associated pattern in the output string, in which case the sequence of output strings can 
always be taken to be infinite. As explicated above, in the real world inputs and (sequences of)
outputs are elements of partitions of $\hat{U}$. So in one translation of TMs to physical computers,
strings on tapes are replaced with elements of the partitions IN(.) and OUT(.). Rather than
through a set of internal states, read/write operations, state-transition rules, etc., the
transformation of inputs to outputs in a physical computer is achieved simply through the definition of the
pair of an associated input partition and output partition. For a TM that declares in its output
string whether it has halted, the physical computation analogue of whether a computation will
ever halt is simply whether $\hat{u}$ is in some special subset of {OUT}. Although not formally
required, in the real world IN(.) and OUT(.) usually differ. In this they are analogous to TMs with
multiple tapes rather than conventional single-tape TMs.

An alternative to identifying the full output partition of a physical computer with a TM's
output tape, motivated by the definition of predictability, is to identify the coarser partition
$\hat{u} \rightarrow OUT_p(\hat{u})$ with a TM's output tape. (This is loosely analogous to a TM's being able to overwrite
the "question" originally posed on its tape when producing its "answer" on that tape.) We will
adopt this identification from now on, and use it to identify the physical computation analogue of
a TM as an input partition together with the surjective mapping $\hat{u} \rightarrow OUT_p(\hat{u})$ of an associated
output partition.

This identification motivates several analogues of the Halting theorem. Since whether a
particular physical computer $C^2$ "halts" or not can be translated into whether its output is in a particular
region, the question of whether $C^2$ halts is a particular intelligibility function of $C^2$. Correctly
answering the question of whether $C^2$ halts means predicting that intelligibility function of $C^2$. In
the context of physical computation it is natural to broaden the issue to concern all intelligibility
functions of $C^2$. Accordingly, in this analogue of the claim resolved for TMs (in the negative) by
the Halting theorem, one asks if it is possible to construct a physical computer $C^1$ that can predict
any computer $C^2$. To answer this, consider the case where $C^2$ is a copy of $C^1$ (cf. Def. 2(v) of
paper I for a formal definition of a physical computer's "copy"). Then by applying Thm.'s 2, 3 and
5, one sees that the answer is no, in agreement with the Halting theorem. (See also Coroll. 3.)

There exist a number of alternative physical computer analogues of the Halting problem.
Though not pursued at length here, it is worth briefly presenting one such alternative. This
alternative is motivated by arguing that, in the real world, one is not interested so much in whether the
computation will ever "halt", but rather whether the associated output is "correct". If we take
"correct" to be relative to a particular question, this motivates the following alternative analogue
of the Halting theorem:

Theorem 6: Given a set of physical computers $\{C^i\}$, $\nexists\, C^1 \in \{C^i\}$ such that $\forall\, C^2 \in \{C^i\}$,

i) $C^2$ is intelligible to $C^1$;

ii) $\forall\, q^2 \in Q^2$, $\exists\, IN^1 \in \{IN^1\}$ such that $IN^1(\hat{u}) = IN^1 \Rightarrow OUT^1_\alpha(\hat{u}) = 1$ iff $q^2(\hat{u}) =
OUT^2_\alpha(\hat{u})$.

Proof: Choose $C^2$ such that $OUT^2(.) = OUT^1(.)$. (If need be, to do this simply choose $C^2 = C^1$.)
Then in particular, $OUT^1_\alpha(.) = OUT^2_\alpha(.)$. Now since $C^2$ is intelligible to $C^1$ by hypothesis, by
Lemma 1 $\exists\, q^1 \in Q^1$ such that $A(q^1) = \{0\}$, and therefore $\exists\, q^2 \in Q^2$ such that $A(q^2) = \{0\}$. For
that $q^2$, $OUT^1_\alpha(\hat{u}) = 1$ iff $0 = OUT^1_\alpha(\hat{u})$, which is impossible. QED.

A TM $T^1$ can emulate a TM $T^2$ if for any input for $T^2$, $T^1$ produces the same output as $T^2$
when given an appropriately modified version of that input. (Typically, the "modification"
involves prepending an encoding of $T^2$ to that input.) The analogous concept for a physical
computer is strong predictability; one physical computer can "emulate" another if it can strongly
predict that other one. Intuitively, the two components of $T^1$'s emulating $T^2$, involving $T^2$'s input and
its computational behavior, respectively, correspond to the two components of the requirement
concerning $IN^1$ values that occur in the definition of strong predictability. The requirement
concerning $IN^1$ values that is imposed by ensuring that $OUT^1_p(\hat{u}) = (A(q), q(\hat{u}))$ for any q (that is
an intelligibility function) for $C^2$ is analogous to encoding (the computational behavior of) the
TM $T^2$ in a string provided to the emulating TM, $T^1$. Requiring as well that the value $IN^1$ ensures
that $IN^2(\hat{u}) = IN^2$ is analogous to also including an "appropriately modified" version of $T^2$'s
input in the string provided to $T^1$. (Note that any mapping taking $IN^2 \in \{IN^2\}$ to an $IN^1$ that in
turn induces that starting $IN^2$ is invertible, by construction.) This motivates the following
definition of the analogue of a universal TM:

Definition 9: A universal physical computer for a set of physical computers is a member of that 
set that can strongly predict all other members of that set. 

Note that rather than reproduce the output of a computer it is strongly predicting, a universal 
physical computer produces the value of an intelligibility function applied to that output. This 
allows the computers in our set to have different output spaces from the universal physical com- 
puter. However it contrasts with the situation with conventional TM's, being a generalization of 
such TM's. 

v) Prediction complexity 

In computer science theory, given a universal TM T, the algorithmic complexity of an output 
string s is defined as the length of the smallest input string s' that when input to T produces s as 
output. To construct our physical computation analogue of this, we need to define the "length" of 
an input region of a physical computer. To do this we start with the following pair of definitions: 

Definition 10: For any physical computer C with input space {IN}:

i) Given any partition $\pi$, a (weak) prediction input set (of C, for $\pi$) is any set $s \subseteq \{IN\}$ such
that both every intelligibility function for $\pi$ is weakly induced by an element of s, and for any
proper subset of s at least one such function is not weakly induced. We write the space of all weak
prediction input sets of C for $\pi$ as $C^{-1}(\pi)$.

ii) Given any other physical computer C' with input space {IN'} for which the set of all
question-independent intelligibility functions is {f'}, a strong prediction input set of C, for the triple
C', $in' \subseteq \{IN'\}$, and $\mathit{f}' \subseteq \{f'\}$, is any set $s \subseteq \{IN\}$ such that both every pair $(f \in \mathit{f}', IN' \in in')$ is
strongly induced by a member of s, and for any proper subset of s at least one such pair is not
strongly induced. We write the space of all strong prediction input sets (of C, for C', in', and $\mathit{f}'$) as
$C^{-1}(C', in', \mathit{f}')$.

Intuitively, the prediction set of C for $\pi$ / C' is a minimal subset of {IN} that is needed by C for $\pi$ /
C' to be predictable to C. In the case of strong prediction, we provide the associated definition the
extra flexibility of being able to restrict what intelligibility functions are being considered.

Now, to define the physical computation analogue of algorithmic information complexity, 
identify the "length of an input string" with the negative logarithm of the volume of a subset of the 
partition IN(.): 

Definition 11: Given a physical computer C and a measure $d\mu$ over $\hat{U}$:

i) Define $V(in \subseteq \{IN\})$ as the measure of the set of all $\hat{u} \in \hat{U}$ such that $IN(\hat{u}) \in in$, and define the
length of in (with respect to IN(.)) as $l(in) \equiv -\ln[V(in)]$;

ii) Given a partition $\pi$ that is predictable to a physical computer C, define the prediction
complexity of $\pi$ (with respect to C), $c(\pi \mid C)$, as $\min_{\rho \in C^{-1}(\pi)} [l(\rho)]$.
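
The two parts of Def. 11 can be sketched numerically on a toy model. This is only an illustration under stated assumptions: the universe is a small finite set with a normalized measure given as a dictionary `mu`, the input partition is a labelling function `IN`, and the family of prediction input sets is supplied by hand rather than derived from any actual computer. All names are hypothetical.

```python
import math

def volume(in_subset, IN, universe, mu):
    """Measure of all points u whose input label IN(u) lies in in_subset."""
    return sum(mu[u] for u in universe if IN(u) in in_subset)

def length(in_subset, IN, universe, mu):
    """Length of an input subset: minus the natural log of its volume."""
    return -math.log(volume(in_subset, IN, universe, mu))

def prediction_complexity(prediction_input_sets, IN, universe, mu):
    """Minimum length over a given family of prediction input sets."""
    return min(length(s, IN, universe, mu) for s in prediction_input_sets)

universe = range(8)                       # toy universe (assumption)
mu = {u: 1 / 8 for u in universe}         # uniform, normalized measure (assumption)
IN = lambda u: u % 4                      # toy input partition with labels 0..3

len_single = length({0}, IN, universe, mu)       # volume 2/8
len_pair = length({0, 1}, IN, universe, mu)      # volume 4/8
# Hand-picked family standing in for the space of prediction input sets.
c = prediction_complexity([{0, 1}, {2}], IN, universe, mu)
```

Larger input subsets have larger volume and hence smaller length, so the minimum in part (ii) is attained by the highest-volume prediction input set in the family.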

We are primarily interested in prediction complexities of binary partitions, in particular of the
binary partitions induced by the separate single elements of multi-element partitions. (The binary
partition induced by some $p \in \pi'$ is $\{\hat{u}$ s.t. $\pi'(\hat{u}) = p$, $\hat{u}$ s.t. $\pi'(\hat{u}) \ne p\}$.) To see what Def. 11(ii)
means for such a partition, say you are given some set $\sigma \subset \hat{U}$ (i.e., you are given a binary partition
of $\hat{U}$). Suppose further that you wish to know whether the universe is in $\sigma$, and you have some
computer C to use to answer (all four intelligibility functions of) this question. Then loosely
speaking, the prediction complexity of $\sigma$ with respect to C is the minimal amount of Shannon
information that must be imposed in C's inputs in order to be assured that C's output correctly
answers that question. In particular, if $\sigma$ corresponds to a potential future state of some system S
external to C, then $c(\sigma \mid C)$ is a measure of how difficult it is for C to predict that future state of
S.

In many situations it will be most natural to choose $d\mu$ to be uniform over accessible phase
space volume, so that the complexity of in is the negative physical entropy of constraining $\hat{u}$ to lie
in in. But that need not be the case. For example, we can instead define $d\mu$ so that the volume of
each element of the associated {IN} is some arbitrary positive real number. In this case, the
lengths of the elements of {IN} provide us with an arbitrary ordering over those elements.

The following example illustrates the connection between lengths of regions in and lengths of 
strings in TM's: 

Example 3: In a conventional computer (see Ex. 1 above), we can define a "partial string" s 
(sometimes called a "file") taking up the beginning of an input section as the set of all "complete 
strings" taking up the entire input section whose beginning is s. We can then identify the input to 
the computer as such a partial string in its input section. (Typically, there would be a special fixed- 
size "length of partial string" region even earlier, at the very beginning of the input section, telling 
the computer how much of the complete string to read to get that partial string.) If we append
certain bits to s to get a new longer input partial string, s', the set of complete strings consistent with
s' is a proper subset of the set of complete strings consistent with s. Assuming our measure $d\mu$ is
independent of the contents of the "length of partial string" region, this means that l(s') > l(s).

This is in accord with the usual definition of the length of a string used in Turing machine
theory. Indeed, if s' contains n more bits than does s, then there are $2^n$ times as many complete strings
consistent with s as there are consistent with s'. Accordingly, if we take logarithms to have base 2,
l(s') = l(s) + n.

Say we want our computer to be able to predict whether $\hat{u}$ lies in some set $\sigma$. (To maintain the
analogy with Turing machines, $\sigma$ could delineate an "output partial string". This could be done for
example by delineating a particular $OUT_p$ value, perhaps even one in some other computer.) In
the usual way, this corresponds to having the binary partition $\{\hat{u} \in \sigma, \hat{u} \notin \sigma\}$ be weakly
predictable to our computer. So the prediction complexity of that prediction is the length of the shortest
region of our input space that will weakly induce that prediction. (Note that since we require that
all four intelligibility functions of $\sigma$ be induced, more than one input "partial string" is required
for that induction, in general.)
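
The length bookkeeping of Example 3 can be checked directly. The sketch below assumes a uniform measure over all complete bit strings of a fixed size and ignores the "length of partial string" region; the function name `length_bits` and the 8-bit input section are hypothetical choices for illustration.

```python
import math
from itertools import product

def length_bits(prefix, total_bits):
    """-log2 of the fraction of complete strings that begin with `prefix`,
    under a uniform measure over all bit strings of length total_bits."""
    complete = ("".join(bits) for bits in product("01", repeat=total_bits))
    count = sum(1 for s in complete if s.startswith(prefix))
    return -math.log2(count / 2 ** total_bits)

l_s = length_bits("10", 8)           # partial string s
l_s_prime = length_bits("10110", 8)  # s' = s with 3 more bits appended
extra = l_s_prime - l_s
```

With these choices `extra` equals the number of appended bits, matching l(s') = l(s) + n for base-2 logarithms.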

The fact that $OUT_p$ values specify the set $A(OUT_q)$ makes working with Def.'s 10 and 11 a bit
messy. In particular, to relate prediction complexity to properties of the associated universal
physical computer we must use a set of "identity" intelligibility functions defined as follows:

Definition 12 (i): Given a space $X \subseteq B$ and a physical computer C with input and output spaces
{IN} and {OUT} respectively,

$\{I^C_X\}$ is the set of all question-independent intelligibility functions of C where $A(I^C_X) = X$,
and where $\forall\, \hat{u}$ such that $A(OUT_q(\hat{u})) = X$, $I^C_X(\hat{u}) = OUT_\alpha(\hat{u})$.

We also will need the following definition: 

Definition 12 (ii): Given a space $X \subseteq B$ and a physical computer C with input and output spaces
{IN} and {OUT} respectively,

when X is a set $C^{-1}(X)$ is also a set, defined as those $IN \in \{IN\}$ such that $IN(\hat{u}) = IN \Rightarrow
A(OUT_q(\hat{u})) = X$.

So for example, if X = B, a pair $(IN^2 \in [C^2]^{-1}(X), I^2_X \in \{I^2_X\})$ is an input to $C^2$ and an
intelligibility function of $C^2$'s output, respectively. That input $IN^2$ induces an associated output question,
$q \in \{OUT^2_q\}$, that takes on (both) B values as one varies over the $\hat{u}$ input to it. Similarly, the
intelligibility function $I^2_X$ takes on (both) B values as one varies over the inputs to it.

Using these definitions, we now bound how much more complex a partition can appear to $C^1$
than to $C^2$ if $C^1$ can strongly predict $C^2$. Though somewhat forbidding in appearance, intuitively,
the bound simply reflects the complexity cost of "encoding" $C^2$ in $C^1$'s input.

Theorem 8: Given any partition $\pi$ and physical computers $C^1$ and $C^2$ where $C^1 \gg C^2 > \pi$,

i) $c(\pi \mid C^1) - c(\pi \mid C^2) \le$

$$\ln[o(2^{\pi})] - \ln[3] + \max_{\{X \subseteq B,\ IN^2 \in [C^2]^{-1}(X),\ I^2_X \in \{I^2_X\}\}} l[(C^1)^{-1}(C^2, IN^2, I^2_X)] - \min_{\{X \subseteq B,\ IN^2 \in [C^2]^{-1}(X)\}} l[IN^2],$$

or alternatively,

ii) $c(\pi \mid C^1) - c(\pi \mid C^2) \le$

$$\ln[o(2^{\pi})] + \min_{\{X \subseteq B,\ IN^2 \in [C^2]^{-1}(X),\ I^2_X \in \{I^2_X\}\}} l[(C^1)^{-1}(C^2, IN^2, I^2_X)] - \min_{\{X \subseteq B,\ IN^2 \in [C^2]^{-1}(X)\}} l[IN^2].$$



Proof: Given any intelligibility function f for $\pi$, consider any $IN^2_f \in \{IN^2\}$ that weakly induces f,
i.e., such that $IN^2(\hat{u}) = IN^2_f \Rightarrow OUT^2_p(\hat{u}) = (A(f), f(\hat{u}))$. (The analysis will not be affected if $\pi$
is an output partition and we restrict attention to those intelligibility functions for $\pi$ that are
question-independent.) Since $C^1 \gg C^2$, we can then choose an $IN^1$, $IN^1_f(IN^2_f)$, to strongly induce $IN^2_f$
together with any question-independent intelligibility function of $OUT^2_p$. (Indeed, in general
there can be more than one such value of $IN^1$ that induces $IN^2_f$.) So in particular, we can choose it
so that the vector $OUT^1_p(\hat{u}) = (A(I^2_{A(f)}), I^2_{A(f)}(\hat{u}))$ for any possible function $I^2_{A(f)}$. Now for
that $IN^1$, $IN^2(\hat{u}) = IN^2_f$, and therefore $A(OUT^2_q(\hat{u})) = A(f)$, which means that $I^2_{A(f)}(\hat{u}) = OUT^2_\alpha(\hat{u})$,
which in turn equals $f(\hat{u})$ for that $IN^2$. So $\forall\, \hat{u}$ such that $IN^1(\hat{u}) = IN^1_f(IN^2_f)$,
$OUT^1_p(\hat{u}) = (A(f), f(\hat{u}))$. In other words, $IN^1_f(IN^2_f)$ weakly induces in $C^1$ the same intelligibility
function for $\pi$ that $IN^2_f$ weakly induces in $C^2$. However since $IN^1(\hat{u}) = IN^1_f(IN^2_f) \Rightarrow IN^2(\hat{u}) = IN^2_f$,
the set of $\hat{u} \in \hat{U}$ such that $IN^1(\hat{u}) = IN^1_f(IN^2_f)$ is $\subseteq$ the set such that $IN^2(\hat{u}) = IN^2_f$.
This means that $l(IN^1_f(IN^2_f)) \ge l(IN^2_f)$. (Our task, loosely speaking, is to bound this difference
in lengths, and then to extend the analysis to simultaneously consider all such question-independent
intelligibility functions f.)

Take $\{f_i\}$ to be the set of all intelligibility functions for $\pi$. By the preceding construction, $\pi$ is
weakly predictable to $C^1$ with a (not necessarily proper) subset of $\{IN^1_{f_i}(IN^2_{f_i})\}$ being a member
of $(C^1)^{-1}(\pi)$. Now any member of $(C^1)^{-1}(\pi)$ must contain at least three disjoint elements,
corresponding to intelligibility functions q with $A(OUT^1_q(\hat{u})) = B$, {0}, or {1}. (See the discussion
just before Lemma 1.) Accordingly, the volume (as measured by $d\mu$) of any subset of
$\{IN^1_{f_i}(IN^2_{f_i})\} \in (C^1)^{-1}(\pi)$ must be at least 3 times the volume of the element of $\{IN^1_{f_i}(IN^2_{f_i})\}$
having the smallest volume. In other words, the length of any subset of $\{IN^1_{f_i}(IN^2_{f_i})\} \in (C^1)^{-1}(\pi)$
must be at most $-\ln(3)$ plus the length of the longest element of $\{IN^1_{f_i}(IN^2_{f_i})\}$. Therefore
$c(\pi \mid C^1) \le \max_{f_i} [l(IN^1_{f_i}(IN^2_{f_i}))] - \ln(3)$.

Now take $\{IN^2_{f_i}\}$ to be the set in $(C^2)^{-1}(\pi)$ with minimal length. $\{IN^2_{f_i}\}$ has at most $o(2^{\pi})$
disjoint elements, one for each intelligibility function for $\pi$. Using the relation $\min_i[g_i] = -\max_i[-g_i]$,
this means that $c(\pi \mid C^2) \ge -\ln[o(2^{\pi})] + \min_{f_i} [l(IN^2_{f_i})]$. Therefore we can write
$c(\pi \mid C^1) - c(\pi \mid C^2) \le \ln[o(2^{\pi})] - \ln(3) + \max_{f_i} [l(IN^1_{f_i}(IN^2_{f_i}))] - \min_{f_i} [l(IN^2_{f_i})]$.
The fact that for all $IN^2_{f_i}$, $IN^2(\hat{u}) = IN^2_{f_i} \Rightarrow A(OUT^2_q(\hat{u})) = A(f_i) \subseteq B$ completes the proof of (i).
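The volume-to-length step used for (i) can be checked numerically. The sketch below assumes, as the wording of the proof suggests, that the length of an input region is the negative logarithm of its volume; the function names and the example volumes are hypothetical and are not taken from the paper.

```python
import math

def length(volume):
    """Length of an input region, assumed here to be -ln(volume)."""
    return -math.log(volume)

def union_length(volumes):
    """Length of the union of disjoint regions with the given volumes."""
    return length(sum(volumes))

# Hypothetical volumes of three disjoint input regions (illustrative values only).
volumes = [0.02, 0.05, 0.10]

# The smallest-volume element is the longest one.
longest = max(length(v) for v in volumes)
bound = longest - math.log(3)

# A union of at least three disjoint regions has volume >= 3 * (smallest volume),
# so its length cannot exceed "-ln(3) plus the length of the longest element".
assert union_length(volumes) <= bound
```

The bound is tight only when all three regions have the same volume; any spread in the volumes makes the union strictly shorter than the bound.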

To prove (ii), note that we can always construct one of the sets in $(C^1)^{-1}(\pi)$ by starting with the
set consisting of the element of $\{IN^1_{f_i}(IN^2_{f_i})\}$ having the shortest length, and then successively
adding other $IN^1$ values to that set, until we get a full (weak) prediction set. Therefore
$c(\pi \mid C^1) \le \min_{f_i} l(IN^1_{f_i}(IN^2_{f_i}))$. Using this bound rather than the one involving $-\ln(3)$
establishes (ii). QED.
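The inequalities used in the proof can be collected in one place. The block below is a summary sketch in LaTeX (it assumes the amsmath package for `align*`). Its notation is a reading of the extracted text: $o(2^{\pi})$ is taken to be the number of intelligibility functions for $\pi$, and the inequalities are taken to be non-strict.

```latex
\begin{align*}
c(\pi \mid C^1) &\le \max_{f_i}\, l\bigl(IN^1_{f_i}(IN^2_{f_i})\bigr) - \ln 3
  && \text{(at least three disjoint elements)} \\
c(\pi \mid C^1) &\le \min_{f_i}\, l\bigl(IN^1_{f_i}(IN^2_{f_i})\bigr)
  && \text{(grow a prediction set from the shortest element)} \\
c(\pi \mid C^2) &\ge \min_{f_i}\, l\bigl(IN^2_{f_i}\bigr) - \ln\bigl[o(2^{\pi})\bigr]
  && \text{(at most } o(2^{\pi}) \text{ disjoint elements)}
\end{align*}
```

Subtracting the third line from the first gives bound (i); subtracting it from the second gives bound (ii).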

Note that the set of $X \subseteq B$ such that $[C^2]^{-1}(X)$ exists must be non-empty, since $C^2 > \pi$.
Similarly, $C^2 > \pi$ means that there is a $\hat{u}$ such that $A(OUT^2_q(\hat{u})) = X \subseteq B$. The associated $I^2_X$ always
exists by construction: simply define $I^2_X(\hat{u}) = OUT^2_\alpha(\hat{u})$ $\forall\, \hat{u}$ such that $A(OUT^2_q(\hat{u})) = X$, and
for all other $\hat{u}$, $I^2_X(\hat{u}) = x$ for some $x \in X$. Therefore the extrema in our bounds are always
well-defined.

As one varies $\pi$, in both bounds in Thm. 8 the dependence of the bound on $C^1$ and $C^2$ does not
change. In addition, those bounds are independent of $\pi$ for all $\pi$ sharing the same cardinality. So in
particular they are independent of $\pi$ for all binary partitions like those discussed in Ex. 3. This
illustrates how Thm. 8 is the physical computation analogue of the result in Turing machine theory
that the difference in algorithmic complexity of a fixed string with respect to two separate Turing
machines is bounded by the complexity of "emulating" the one Turing machine on the other,
independent of the fixed string in question.
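For comparison, the Turing-machine result alluded to here is usually called the invariance theorem of algorithmic information theory. One standard form is sketched below in LaTeX. The symbols $K_U(x)$ (the length of the shortest program that makes universal machine $U$ output the string $x$) and $c_{U,V}$ are conventional notation and do not appear in this paper.

```latex
\[
  \bigl| K_U(x) - K_V(x) \bigr| \;\le\; c_{U,V}
  \qquad \text{for all strings } x ,
\]
```

The constant $c_{U,V}$ depends only on the two universal machines $U$ and $V$, not on $x$. The bounds above have the same shape: their right-hand sides depend on the two computers, and on the partition only through its cardinality.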

Consider the possibility that for the laws of physics in our universe, there exist partitions IN(.) 
and OUT(.) that constitute a universal physical computer C* for all other physical computers in 
our universe. Then by Thm. 5, no other computer is similarly universal. Therefore there exists a 
unique prediction complexity measure that is applicable to all physical computers in our universe, 
namely complexity with respect to C*. (This contrasts with the case of algorithmic information 
complexity, where there is an arbitrariness in the choice of the universal TM used.) If instead there 
is no universal physical computer in our universe, then every physical computer C must fail at 
least once at (strongly) predicting some other physical computer. (Note that unlike the case with 
weak predictability considered in Thm. 2, here we aren't requiring that the universe be capable of 
having two distinguishable versions of C.) This establishes the following: 

Theorem 9: Either infallible strong prediction is impossible in our universe, or there is a unique 
complexity measure in our universe. 

Similar conclusions hold if one restricts attention to a set of (physically localized) conventional 
physical computers (cf. Ex. 1 above), where the light cones in the set are arranged to allow the 
requisite information to reach the putative universal physical computer. 

FUTURE WORK AND DISCUSSION 

Any results concerning physical computation should, at a minimum, apply to the computer 
lying on a scientist's desk. However that computer is governed by the mathematics of determinis- 
tic finite automata, not that of Turing machines. In particular, the impossibility results concerning 
Turing machines rely on infinite structures that do not exist in any computer on a scientist's desk.
Accordingly, there is a discrepancy between the domain of those results and that of any truly general
theory of physical computers.

On the other hand, when one carefully analyzes actual computers that perform calculations 
concerning the physical world, one uncovers a mathematical structure governing those computers 
that is replete with its own impossibility results. While much of that structure parallels Turing 
machine theory, much of it has no direct analogue in that theory. For example, this new structure 
has no need for tapes, moveable heads, internal states, read/write capabilities, and the like, none 
of which have any obvious connection to the laws governing our universe (i.e., any connection to 
quantum mechanics and general relativity). 

In fact, when the underlying functions of real-world computers are stripped down to their 
essentials, one does not even need to identify a "computer" with a device occupying a particular 
localized region of space-time, never mind one with heads and the like. In place of all those con- 
cepts one has a structure involving several partitions over the space of all worldlines of the uni- 
verse. The partitions in that structure delineate a particular computer's inputs, the questions it 
addresses, and its outputs. The impossibility results of physical computation concern the relation 
of those partitions. Computers in the conventional, space-time localized sense (the box on your 
desk) are simply special examples, with lots of extra restrictions that turn out to be unnecessary in 
the underlying mathematics. Accordingly, the general definition of a "physical computer" has no 
such restrictions. A side-benefit of this breadth is that the associated mathematics can be viewed 
as concerning many information-processing activities (e.g., observation, control) normally viewed 
as distinct from computation. 

In the first paper in this pair, this definition of a physical computer was motivated and pre- 
sented, along with some associated theorems. Those theorems imply, amongst other things, that 
fool-proof prediction of the future is impossible — there are always some questions concerning 
the future that cannot even be posed to a computer, and of those that can be posed, there are 
always some for which the computer's answer will be wrong. By exploiting the breadth of the def- 
inition of physical "computation", similar results hold for the information-processing of observa- 
tion and of control. All of this is true even in a classical, non-chaotic, finite universe, and 
regardless of where in the Chomsky hierarchy the computer lies.

This second paper launches from the theorems of the first paper into a broader, albeit preliminary,
investigation of the mathematics of physical computation. It is shown that the computability
structure relating distinct physical computers is that of a directed, acyclic graph. In addition, there
is at most one computer (called a "god computer") that can predict / observe / control all other
computers. Other results derived include limits on error-correction using multiple computers, and
some analogues of the Halting theorem.
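The two graph-theoretic statements above can be illustrated on a toy relation. In the sketch below the computer names and the edge list are hypothetical; an edge (a, b) stands for "a can predict b". The code only checks acyclicity and looks for a node with an edge to every other node. It does not model physical computers themselves.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycles."""
    # Reversing every edge does not change whether a cycle exists, so it is
    # safe to hand graphlib a node -> successors mapping.
    graph = {n: set() for n in nodes}
    for a, b in edges:
        graph[a].add(b)
    try:
        tuple(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False

def god_computers(nodes, edges):
    """Return the nodes that have an edge to every other node."""
    edge_set = set(edges)
    return [a for a in nodes if all((a, b) in edge_set for b in nodes if b != a)]

# Hypothetical predictability relation among three computers.
nodes = ["C1", "C2", "C3"]
edges = [("C1", "C2"), ("C1", "C3"), ("C2", "C3")]

assert is_acyclic(nodes, edges)
assert god_computers(nodes, edges) == ["C1"]

# A second node that predicts every other node would have to predict C1 as
# well, which closes a cycle and violates the DAG structure.
edges_two_gods = edges + [("C2", "C1")]
assert not is_acyclic(nodes, edges_two_gods)
```

This is why acyclicity alone already limits the graph to at most one such node: two of them would have to predict each other.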

Next a definition of the complexity of a particular computational task for a particular physical 
computer, prediction complexity, is motivated. The motivation of this new definition of complex- 
ity proceeds by analogy to the concept of the algorithmic information complexity of a symbol 
sequence for a universal Turing machine. However whereas algorithmic information complexity 
concerns a Turing machine's generating such a symbol sequence, prediction complexity involves 
a physical computer's addressing a computational task concerning the physical universe. 

The difference in prediction complexity of a particular task $\pi$ for two different physical
computers $C^1$ and $C^2$ is considered. It is proven that this complexity difference is bounded by a
function that only depends on $C^1$ and $C^2$, and is independent of $\pi$. This bound relating the difference in
complexity for two physical computers is analogous to the algorithmic information complexity
cost of emulating one universal Turing machine with another one. Finally, it is proven that either a
certain kind of computation is not possible in our universe, or there is a preferred computer in our
universe. If it exists, that computer could be used to uniquely specify the prediction complexity of
any task $\pi$. Accordingly, either a certain kind of computation is impossible, or there is a preferred
definition of physical complexity (in contrast to the arbitrariness inherent in algorithmic
information complexity's choice of universal Turing machine).

The following are just a few of the questions that the analysis of this paper raises:

i) What other restrictions are there on the predictability relations within distinguishable sets of 
physical computers beyond that they form unions of DAG's? In other words, which unions of 
DAG's can be manifested as the predictability relations within a distinguishable set? How does 
this answer change depending on whether we are considering sets of fully input-distinguishable 
computers or sets of pairwise-distinguishable computers? For which computers are there finite /
countably infinite / uncountably infinite numbers of levels below them in the DAG to which they
belong? Might such levels be gainfully compared to the conventional computer science theory
issue of position in the Chomsky hierarchy? 

ii) One might try to characterize the unpredictability-of-the-future result of paper I as the physical 
computation analogue of the following issue in Turing machine theory. Can one construct a Turing
machine M that can take as input A, an encoding of a Turing machine and its tape, and for any
such A compute what state A's Turing machine will be in after n steps, and perform this
computation in fewer than n steps? This characterization suggests investigating the formal
parallels (if any) between the results of these papers and the "speed-up" theorems of
computer science. 

iii) More speculatively, the close formal connection between the results of this second paper and 
those of computer science theory suggests that it may be possible to find physical analogues of
most of the other results of computer science theory, and thereby construct a full-blown "physical 
computer science theory". In particular, it may be possible to build a hierarchy of physical com- 
puting power, in analogy to the Chomsky hierarchy. In this way we could translate computer sci- 
ence theory into physics, and thereby render it physically meaningful. 

We might be able to do at least some of this even without relying on the DAG relationship 
among the physical computers in a particular set. As an example, we could consider a system that 
can correctly predict the future state of the universe from any current state of the universe, before 
that future state occurs. The behavior of such a system is perfectly well-defined, since the laws of 
physics are fully deterministic (for quantum mechanics this statement implicitly presumes that 
one views those laws as regarding the evolution of the wave function rather than of observables 
determined by non-unitary transformations of that wave function). Nonetheless, by the central 
unpredictability result of paper I, we know that such a system lies too high in the hierarchy to 
exist in more than one copy in our physical universe.

With such a system identified with an oracle of computer science theory we have the definition
of a "physical" oracle. Can we construct further analogues with computer science theory by
leveraging that definition of a physical oracle? In other words, can we take the relationships 
between (computer science) oracles, Turing machines, and the other members of the (computer 
science) Chomsky hierarchy, and use those relationships together with our (physical) oracle and 
physical computers to gainfully define other members of a (physical) Chomsky hierarchy? 

iv) Can we then go further and define physical analogues of concepts like P vs. NP, and the like? 
Might the halting probability constant $\Omega$ of algorithmic information theory have an analogue in
physical computation theory? 

As another example of possible links between conventional computer science theory and that 
of physical computers, is there a physical computer analogue of Berry's paradox? Weakly predict- 
ing a partition is the physical computation analogue of "generating a symbol sequence" in algo- 
rithmic information complexity. The core of Berry's paradox is that there are numbers k such that 
no Turing machine can generate a sequence having algorithmic information complexity k (with 
respect to some pre-specified universal Turing machine U). So for example one closely related 
issue in physical computation is to characterize the physical computers $C^1$ and $x \in \mathbb{R}$ such that $\exists$
a computer $C^2$ where $C^1 \gg C^2$ and where $\forall$ partitions $\pi$, $C^2$ weakly predicts whether $c(\pi \mid C^1) > x$
(i.e., such that $\exists\, IN^2 \in \{IN^2\}$ such that $IN^2(\hat{u}) = IN^2 \Rightarrow OUT^2_p(\hat{u}) = (B, \text{whether } c(\pi \mid C^1) > x)$).

v) Concerns of computer science theory, and in particular of the theory of Turing machines, have 
recently been incorporated into a good deal of work on the foundations of physics [33]. Future
work involves substituting physical computers for Turing machines in this work, along with
substituting notions like prediction complexity for notions like algorithmic complexity.

vi) Other future work involves investigating other possible definitions of complexity for physical
computation. Even sticking to analogues of algorithmic information complexity, these might
extend significantly beyond the modifications to the definition of prediction complexity discussed 
in the text. For example, one might try to define the analogue of a bit sequence's "length" in terms 
of the number of elements in Q. One might also take the (inverse) complexity of a computational 
device to be the number of input-distinguishable computers that can predict that device (working 
in some pre-specified input-distinguishable set, presumably). 

vii) Yet other future work includes calculating physical complexity of various systems for some of 
the simple physical models of real-world computers (e.g., "billiard ball" computers, DNA com- 
puting, etc.) that have been investigated, and investigating the prediction complexity of systems 
like crystals and gases. 



FOOTNOTES 



[1] Especially for non-binary $\pi$, many other definitions of prediction complexity besides Def.
11 (ii) can be motivated. For example, one could reasonably define the complexity of $\pi$ to be the
sum of the complexities of each binary partition induced by an element of $\pi$, i.e., one could define
it as $\sum_{p \in \pi} c(\{\hat{u} \in p, \hat{u} \notin p\} \mid C)$. Another variant, one that would differ from the one considered
in the text even for binary partitions, is $\min_{p \in C^{-1}(\pi)} [\sum_{IN \in p} l(IN)]$. For reasons of space, no such
alternatives will be considered in this paper.